The current interest in performance assessment grew out of the
widespread dissatisfaction with standardized tests, and of the widespread 
belief that schools do not develop productive citizens. The purpose of this 
paper is to review the theoretical and empirical literature relevant to 
recent trends in language performance assessment. Following a definition 
of performance assessment, this paper considers: (1) theoretical 
assumptions underlying performance assessment; (2) purposes of 
performance assessment; (3) performance assessment procedures; (4) 
merits and demerits of performance assessment; (5) language performance 
assessment formats and research relevant to each format; (6) criteria for 
selecting performance assessment formats; (7) alternative groupings for 
assessing student performance; (8) performance assessment via computers 
and research related to this area; (9) reliability and validity of 
performance assessment and research related to this area; and (10) 
conclusions drawn from the literature reviewed in this paper. 

Chapter One



Background Information 

1.0 Introduction 

The current interest in performance assessment grew out of the widespread 
dissatisfaction with standardized tests (Bachman, 2000), and of the widespread 
belief that schools do not develop productive citizens (Roeber, 1995). The 
purpose of this paper is to review the literature relevant to this type of 
assessment over the last ten years. 

1.1 Definition of Performance Assessment 

As defined by Nitko (2001), performance assessment is the type of assessment that
“(1) presents a hands-on task to a student and (2) uses clearly defined criteria to 
evaluate how well the student achieved the application specified by the learning 
target” (p. 240). Nitko goes on to state that “[t]here are two aspects of a student’s 
performance that can be assessed: the product the student produces and the 
process a student uses to complete the product” (p. 242). 

In their Dictionary of Language Testing, Davies et al. (1999) define performance 
assessment as “a test in which the ability of candidates to perform particular 
tasks, usually associated with job or study requirements, is assessed” (p. 144). 
They maintain that this performance test “uses ‘real life’ performance as a 
criterion and characterizes measurement procedures in such a way as to 
approximate non-test language performance” (loc. cit.). 

Kunnan (1998) states that performance assessment is “concerned with 
language assessment in context along with all the skills and not in discrete-point 
items presented in a decontextualized manner” (p. 707). He adds that in this type 
of assessment “test takers are assessed on what they can do in situations similar to 
‘real life’” (loc. cit.). 

Thurlow (1995) states that performance assessment “require[s] students to 
create an answer or product that demonstrates their knowledge and skills” (p. 1). 
Similarly, Pierce and O'Malley (1992) define performance assessment as “an 
exercise in which a student demonstrates specific skills and competencies in 
relation to a continuum of agreed upon standards of proficiency or excellence” (p. 2).

As the aforementioned definitions indicate, performance assessment
focuses on the following:

(a) application of knowledge and skills in realistic situations, 

(b) open-ended thinking, 

(c) wholeness of language, and 

(d) processes of learning as well as the products of these processes. 



1.2 Theoretical Assumptions Underlying Performance 
Assessment 

Performance assessment is consistent with modern learning theories. It reflects
cognitive learning theory, which suggests that students must acquire both
content and procedural knowledge. Since particular types of procedural
knowledge are not assessable via traditional tests, cognitivists call for
performance assessment to measure this type of knowledge (Popham, 1999).

Performance assessment is also compatible with Howard Gardner’s (1993, 
1999) theory of multiple intelligences because this type of assessment has the 
potential of permitting students' achievements to be demonstrated and evaluated 
in several different ways. In an interview with Checkley (1997), Gardner himself 
expresses this idea in the following way: 

The current emphasis on performance assessment is well supported by 
the theory of multiple intelligences. . . . [L]et’s not look at things 
through the filter of a short-answer test. Let’s look instead at the 
performance that we value, whether it is linguistic, logical, aesthetic, or 
social performance .... let’s never pin our assessments of 
understanding on just one particular measure, but let’s always allow 
students to show their understanding in a variety of ways. 

Furthermore, performance assessment is consistent with the constructivist 
theory of learning which views learners as active participants in the construction 
and evaluation of their learning processes and products. Based on this theory, 
performance assessment involves students in the process of assessing their own 
performance (Shepard, 2000). 

1.3 Purposes of Performance Assessment 

A large portion of performance assessment literature (e.g., Arter et al., 1995; 
Katz, 1997; Khattri et al., 1998; Tunstall and Gipps, 1996) indicates that 
performance assessment serves the following purposes: 

(a) documenting students’ progress over time, 

(b) helping teachers improve their instruction, 

(c) improving students’ motivation and increasing their self-esteem, 

(d) helping students improve their own learning processes and products, 

(e) making placement or certification decisions, and

(f) providing parents and community members with directly observable 
products concerning students’ performance. 

1.4 Performance Assessment Procedures 

The major stages of performance assessment, synthesized from a number of
sources (Cheng, 2000; Gallagher, 1998; Martinez, 1998; Nitko, 2001; Palomba
and Banta, 1999; Shaklee et al., 1997; Stiggins, 1994; Wiggins, 1993), are the
following:

(1) Deciding What to Assess and How to Assess It 

At this stage, the teacher should become quite clear about what he/she will 
assess. He/she should also become quite clear about how he/she will assess 
students’ performance. More specifically, the teacher at this stage needs to 
address questions such as the following: 

(a) Which learning target(s) will I assess? 

(b) Will the test task(s) assess the processes, or the products of 
students’ learning, or both? 

(c) Should my students be involved in assessing their own 
performance? 

(d) Should I use holistic or analytic assessment rubrics? 

(2) Developing Assessment Tasks and Performance Rubrics 

In light of the answers to stage-one questions, the teacher develops the 
assessment tasks and performance rubrics. In doing so, he/she must make 
sure that students will understand what he/she expects them to do. After 
developing the assessment tasks and performance rubrics, the teacher 
should pilot them on subjects that represent the target test population to 
identify problems and remove them. 

(3) Assessing Students’ Performance 

Using the performance rubrics created at the second stage, the
teacher scores students’ performance. The questions the teacher might
answer at this stage include: 

(a) What are the strengths in the student’s performance? 

(b) What are the weaknesses in the student’s performance? 

(c) What evidence of self-, peer- or group assessment appears in the 
student’s performance? 

(4) Interpreting and Reporting Students’ Results 

At this stage, the teacher analyses and discusses students’ results in light of
the teaching strategies he/she used as well as the learning strategies students
employed. In light of these results, the teacher also suggests ways to develop
his/her teaching strategies and to improve students’ performance. As
Wiggins (1993) puts it:



Assessment should improve performance, not just audit it. . . .
Assessment done properly should begin conversations about
performance not end them. . . . If the testing we do in the name of
accountability is still one event, year-end testing, we will never
obtain valid and fair information. (pp. 5, 13, 267)

At this stage, the teacher should also create a performance-based report
card. This card should focus on reporting the strengths and weaknesses of
the student’s performance instead of numerical grades (Fleurquin, 1998;
Stix, 1997). In short, the teacher must address the following questions at this
stage:

(a) What do these results tell me about the effectiveness of the 
instructional program? 

(b) What kind of evidence will be useful to me and to my students? 

(c) How can I report my students’ results? 

1.5 Merits and Demerits of Performance Assessment 

The advantages of performance assessment include its potential to assess ‘doing,’ 
its consistency with modern learning theories, its potential to assess processes as 
well as products, its potential to be linked with teaching and learning activities, 
and its potential to assess language as communication (Brualdi, 1998; Linn and
Gronlund, 1995; Mehrens, 1992; Stiggins, 1994). Although performance
assessment offers these advantages over traditional assessment, it also has some 
distinct disadvantages. The first disadvantage is that performance assessment 
tasks take a lot of time to complete (Oosterhof, 1994). If such tasks are not part
of the instructional procedures, this means either administering fewer tasks (thereby
reducing the reliability of the results), or reducing the amount of instructional 
time (Nitko, 2001). The second disadvantage is that performance assessment tasks 
do not assess all learning targets well, particularly in the situations where some 
learning targets focus on bits and pieces of information (Bailey, 1998; Soodak, 
2000). The third disadvantage is that the scoring of performance tasks takes a lot 
of time (Rudner and Boston, 1994). The fourth disadvantage is that scores from 
performance tasks may have lower scorer reliability (Fuchs, 1995; Hutchinson, 
1995; Koretz et al., 1994; Miller and Legg, 1993). The fifth and final disadvantage 
is that performance tasks may be discouraging to less able students (Gomez,
2000; Meisels et al., 1995).

Chapter Two

Performance Assessment Formats 



2.0 Introduction 

Assessment specialists (e.g., Cohen, 1994; Genesee and Upshur, 1996; Nitko, 
2001; Popham, 1999; Stiggins, 1994) have proposed a wide range of 
alternatives for assessing students’ performance. Such alternatives fall into two 
major categories: (I) naturally occurring assessment, and (II) on-demand 
assessment. Each of these categories is described below. 

2.1 Naturally Occurring Assessment 

This type of assessment refers to observing students’ normally occurring 
performance in naturalistic environments without intervening or structuring 
the situation, and without informing the students that they are being 
assessed (Fisher, 1995; Stiggins, 1994; Tompkins, 2000). The major advantage 
of this type of assessment is that it provides a realistic view of a student’s 
language performance (Norris and Hoffman, 1993). Another advantage of 
this type of assessment is that it is not a source of anxiety and psychological 
tensions for the students (Antonacci, 1993). However, this type of assessment 
does not seem practically feasible because of the following shortcomings 
(Nitko, 2001): 

(a) it is difficult and time-consuming to use with large numbers of 
students, 

(b) it is inadequate on its own because it cannot provide the teacher 
with all the data he/she needs to thoroughly assess students’ 
performance, and 

(c) the teacher cannot ensure that all students will perform the same 
tasks under similar conditions. 

Research on Naturally Occurring Assessment 

A survey of research on naturally occurring assessment indicated that 
whereas several studies used this type of assessment as a research tool in 
addition to standardized testing (e.g., Brooks, 1995; Lemons, 1996; 
Mauerman, 1995; Santos, 1998; Wright, 1995), no studies investigated its 
effect on students’ performance. However, indirect support for this type of 
assessment comes from studies which found that test anxiety negatively 
affected students’ language performance (e.g., Dugan, 1994; Ross, 1995;
Teemant, 1997). 



2.2 On-Demand Assessment 

Because of the shortcomings of naturally occurring assessment, many
assessment formats were developed to elicit certain types of performance.
These formats include oral interviews, individual or group projects, portfolios,
dialogue journals, story retelling, oral reading, group discussions, role playing,
teacher-student conferences, and retrospective and introspective verbal
reports. This section describes the on-demand formats that are well suited for
assessing language performance.

2.2.1 Oral Interviews 

Oral interviews are the simplest and most frequently employed format for 
assessing students’ oral performance and learning processes (Fordham et 
al., 1995; McNamara, 1997a; Thurlow, 1995). This format can take 
different forms: the teacher interviewing the students, the students 
interviewing each other, or the students interviewing the teacher (Graves,
2000).

Chalhoub-Deville (1995) claims that oral interviews offer a realistic 
means of assessing students’ oral language performance. However, 
opponents of this format argue that such interviews are artificial because 
students are not placed in natural, real-life speech situations, and are thus 
susceptible to psychological tensions and to constraints of style and 
register (Antonacci, 1993). They also argue that a face-to-face interview is 
time-consuming because it cannot be conducted simultaneously with more 
than one student by a single interviewer (Weir, 1993). 

Stansfield (1992) suggests that oral interviews should progress through 
the following four stages: 

(a) Warm-up: At this stage the interviewer puts the interviewee at 
ease and makes a very tentative estimate of his/her level of 
proficiency. 

(b) Level checks: During this stage, the interviewer guides the 
conversation through a number of topics to verify the tentative 
estimate arrived at during the previous stage. 

(c) Probes: During this stage the interviewer raises the level of the
conversation to determine the limitations in the interviewee’s
proficiency or to demonstrate that the interviewee can
communicate effectively at a higher level of language.



(d) Wind-down: At this stage the interviewer puts the interviewee 
at ease by returning to a level of conversation that the 
interviewee can handle comfortably. 

To effectively integrate oral interviews with language learning, 
Tompkins and Hoskisson (1995) suggest that students can conduct 
interviews with each other or with other members of the community. 
They further suggest that students should record such interviews and 
submit the tapes to the teacher for assessment (ibid.). 

To make interviewing intimately tied to teaching, Maden and Taylor 
(2001) suggest that the interviewer, usually the teacher, should enter into 
interaction with students for both teaching and assessment purposes. 

To make interviewing intimately tied to the ultimate goals of 
assessment, the interviewer should use interview sheets (Lumley and 
Brown, 1996). Such sheets usually contain the questions the interviewer 
will ask and blank spaces to record the student’s responses (ibid.). 
Additionally, audio and video cassettes can be made of oral interviews for 
later analysis and evaluation (Tannenbaum, 1996). 

Stansfield and Kenyon (1996) suggest using a tape-recorded format as 
an alternative to face-to-face interviews. They claim that such a tape- 
recorded format can be administered to many students within a short 
span of time, and that this format can help assessors to control the quality 
of the questions as well as the elicitation procedures (ibid.). 

Alderson (2000) suggests that oral interviews can be extremely helpful 
in assessing students’ reading strategies and attitudes towards reading. In 
such a case, students can be asked about the texts they have read, how 
they liked them, what they did not understand, what they did about this, 
and so on (ibid.). 

Research on Oral Interviews 

A survey of recent research on oral interviews indicated that whereas 
several studies used this format as a research tool for assessing students’ 
oral performance (e.g., Berwick and Ross, 1996; Careen, 1997; Fleming
and Walls, 1998; Kiany, 1998; Lazaraton, 1996), and for exploring
students’ reading strategies (e.g., Harmon, 1996; Vandergrift, 1997), no 
studies used it as an on-going technique for both assessment and 
instructional purposes. 

2.2.2 Individual or Group Projects 

Many educators and assessment specialists (e.g., Greenwald and Hand, 
1997; Gutwirth, 1997; Katz and Chard, 1998; Ngeow and Kong, 2001; 
Sokolik and Tillyer, 1992) suggest assessing students’ language
performance with group or individual projects. Such projects are
in-depth investigations of topics worth learning more about. They
focus on finding answers to questions about a topic posed by the
students, by the teacher, or by the teacher working with students
(Gutwirth, 1997).

The advantages of using projects for both instructional and assessment 
purposes include helping students bridge the gap between language study 
and language use, integrating the four language skills, increasing 
students’ motivation to learn, taking the classroom experience out into the 
community, using the language in real life situations, allowing teachers to 
assess students’ performance in a relatively non-threatening atmosphere, 
and deepening personal relationships between the teacher and students 
and among the students themselves (Fried-Booth, 1997; Katz, 1997; Katz 
and Chard, 1998; Warschauer et al., 2000). However, project work may
take a long time and require human and material resources that are not
easily accessible in the students’ environment.

Katz and Chard (1998) suggest that a project topic is appropriate if 

(a) it is directly observable in the students’ environment, 

(b) it is within most students’ experiences, 

(c) direct investigation is feasible and not potentially dangerous, 

(d) local resources (e.g., field sites and experts) are readily 
accessible, 

(e) it has good potential for representation in a variety of media 
(e.g., role play, writing), 

(f) parental participation and contributions are likely, 

(g) it is sensitive to the local culture and culturally appropriate in 
general, 

(h) it is potentially interesting to students, or represents an 
interest that teachers consider worthy of developing in 
students, 

(i) it is related to curriculum goals, 

(j) it provides ample opportunity to apply basic skills (depending 
on the age of the students), and 

(k) it is optimally specific — neither too narrow nor too broad. 

Project topics are usually investigated by a small group of students 
within a class, sometimes by a whole class, and occasionally by an 
individual student (Greenwald and Hand, 1997). During project work 
students engage in many activities including reading, writing, 
interviewing, recording observations, etc. (Ngeow and Kong, 2001). 

Fried-Booth (1997) suggests that a project should move through three
stages: project planning, carrying out the project, and reviewing and 
evaluating the work. She further suggests that at each of these three 
stages, the teacher should work with the students as a counselor and 
consultant (ibid.). Similarly, Katz (1994) suggests the following three 
stages for project work: 

(a) selecting the project topic, 

(b) direct investigation of the project, and 

(c) culminating and debriefing events. 

Stoller (1997) proposes the following ten-step procedure for integrating 
project work into content-based language classrooms: 

(a) students and instructor agree on a theme for the project, 

(b) students and instructor determine the final outcome, 

(c) students and instructor structure the project, 

(d) instructor prepares students for information gathering, 

(e) students gather information, 

(f) instructor prepares students for compiling and analyzing data, 

(g) students compile and analyze data, 

(h) instructor prepares students for the culminating activity, 

(i) students present the final product, 

(j) students evaluate the product. 

Recently, new technology has made it possible to implement projects on
the computer if students have Internet access. For information about
how this can be done, see Warschauer (1995) and Warschauer et al.
(2000).

Research on Language Projects 

A review of research on language projects revealed that only two studies 
were conducted in this area in the last decade. In one of them, Hunter and
Bagley (1995) explored the potential of global telecommunication
projects. Results indicated that such projects developed students’ literacy
skills, their personal and interpersonal skills, as well as their global 
awareness. In the other study, Smithson (1995) found that the on-going 
assessment of writing through projects improved students’ writing. 

2.2.3 Portfolios 

Portfolios are purposeful collections of a student’s work which exhibit 
his/her performance in one or more areas (Arter et al., 1995; Barton and 
Coley, 1994; Graves, 2000). In language arts, there is a spreading 
emphasis on this format as an alternative type of assessment (Gomez, 
2000; Jones and Vanteirsburg, 1992; Newman and Smolen, 1993; Pierce 
and O’Malley, 1992). Many advantages have been claimed for this type of 
assessment. The first advantage is that this alternative links assessment to 
teaching and learning (Hirvela and Pierson, 2000; Porter and Cleland,
1995; Shackelford, 1996). The second advantage of this alternative is that 
it gives students a voice in assessment and helps them diagnose their own 
strengths and weaknesses (Courts and McInerney, 1993). The third
advantage is that this alternative can be tailored to the student’s needs, 
interests, and abilities (Ediger, 2000). Additional advantages of this 
alternative are stated by Arter et al. (1995) in the following way: 

The perceived benefits [of portfolios as an assessment format] are 
that the collection of multiple samples of student work over time 
enables us to (a) get a broader, more in-depth look at what the 
students know and can do; (b) base assessment on more 
“authentic” work; (c) have a supplement or alternative to report 
cards and standardized tests; and (d) have a better way to 
communicate student progress to parents. (p. 2)

However, as with all performance assessment formats, it is quite 
difficult to come up with consistent evaluations of different students’ 
portfolios (Dudley, 2001; Hewitt, 2001). Another problem with portfolio 
assessment is that it takes time to be carried out properly (Koretz, 1994; 
Ruskin-Mayher, 1999). In spite of these demerits, portfolio assessment is 
growing in use because its merits outweigh its demerits. 

Tannenbaum (1996) suggests that the following types of materials can 
be included in a portfolio: 

(a) audio- and videotaped recordings of readings or oral
presentations, 

(b) writing samples such as dialogue journal entries and book 
reports, 

(c) writing assignments (drafts or final copies), 

(d) reading log entries, 

(e) conference or interview notes and anecdotal records, 

(f) checklists (by teacher, peers, or student), and 

(g) tests and quizzes. 

To gain multiple perspectives on students’ language development, 
Tannenbaum (1996) further suggests that students should include more
than one type of material in the portfolio. More specifically, Farr and
Tone (1994) suggest that the best guides for selecting work to include in a 
language arts portfolio are these two questions: “What do these materials 
tell me about the student?” and “Will the information obtained from 
these materials add to what is already known?” However, May (1994) 
contends that teachers should let students decide what they want to 
include in a portfolio because this makes them feel they own their 
portfolios and that this feeling of ownership leads to caring about
portfolios and to greater effort and learning. 

Tenbrink (1999) suggests that using portfolios for assessing students’ 
performance requires the following: 

(a) deciding on the portfolio’s purpose, 

(b) deciding who will determine the portfolio’s content, 

(c) establishing criteria for determining what to include in the 
portfolio, 

(d) determining how the portfolio will be organized and how the 
entries will be presented, 

(e) determining when and how the portfolio will be evaluated, and 

(f) determining how the evaluations of the portfolio and its 
contents will be used. 

To be an effective assessment format, portfolios must be consistent with 
the goals of the curriculum and the teaching activities (Arter and Spandel, 
1992; Tenbrink, 1999). That is, they should focus on the same targets 
emphasized in the curriculum as well as the daily instruction activities. As 
Tenbrink (1999) puts it, “Portfolios can be a very powerful tool if they are 
fully integrated into the total instructional process, not just a tag-on at the 
end of instruction” (p. 332). 

To make portfolios more useful as an assessment format, some 
educators (e.g., Farr, 1994; Grace, 1992; Wiener and Cohen, 1997) suggest 
that the teacher should occasionally schedule and conduct portfolio 
conferences. Through such conferences, students share what they know 
and gain insights into how they operate as readers and writers. Although 
such conferences may take time, they are pivotal in making sure that 
portfolio assessment fulfills its potential (Ediger, 1999). In order to make 
such conferences time efficient, Farr (1994) suggests that the teacher 
should encourage students to prepare for them and to come up with 
personal appraisals of their own work. 

Since current technology allows for the storage of information in the 
form of text, graphics, sound, and video, many assessment specialists (e.g., 
Barret, 1994; Chang, 2001; Hetterscheidt et al., 1992; Wall, 2000) suggest 
that students should save their portfolios on a floppy disk or on a website. 
Such assessment specialists claim that this makes students’ portfolios 
available for review and judgment by others. Other advantages of 
electronic portfolios are stated by Lankes (1995) this way: 

The implementation of computer-based portfolios for student 
assessment is an exciting educational innovation. This method 
of assessment not only offers an authentic demonstration of 
accomplishments, but also allows students to take responsibility 
for the work they have done. In turn, this motivates them to
accomplish more in the future. A computer-based portfolio 
system offers many advantages for both the education and the 
business communities and should continue to be a popular 
assessment tool in the “information age.” (p. 3) 

Research on Language Portfolios 

A survey of recent research on portfolios revealed that many investigators 
used this format as a research tool for assessing students’ writing (e.g., 
Camp, 1993; Condon and Hamp-Lyons, 1993; Hamp-Lyons, 1996). A
second body of research indicated that portfolio assessment
situated in the context of language teaching and learning improved the
quality and quantity of students’ writing (e.g., Horvath, 1997; Moening
and Bhavnagri, 1996), enabled learning disabled students to diagnose
their own strengths and weaknesses (e.g., Boerum, 2000; Holmes and 
Morrison, 1995), and had a positive effect on teachers’ understanding of 
assessment and on students’ understanding of themselves as learners and 
writers (Ponte, 2000; Tanner, 2000; Wolfe, 1996). A third body of 
research investigated teachers’ or students’ perceptions of portfolios after 
their involvement in portfolio assessment. In this respect, Lylis (1993) 
found that teachers felt that portfolio assessment helped them document 
students’ development as writers and offered them a greater potential in 
understanding and supporting their students’ literacy development. 
Additionally, Anselmo (1998) found that students who assessed their own
portfolios felt that their motivation increased.

2.2.4 Dialogue Journals 

Dialogue journals, in which students write freely and regularly about their
activities, experiences, and plans, can be a rich source of information
about students’ reading and writing performance (Bello, 1997; Borich,
2001; Peyton and Staton, 1993; Schwarzer, 2000). Such journals are also
a powerful tool with which teachers can collect information on students’
reading and writing processes (Garcia, 1998; Graves, 2000).

The advantages of using dialogue journals for both instructional and 
assessment purposes include individualizing language teaching, making 
students feel that their writing has a value, promoting students’ reflection 
and autonomous learning, increasing students’ confidence in their own 
ability to learn, helping the instructor adapt instruction to better meet 
students’ needs, providing a forum for sharing ideas and assessing 
students’ literacy skills, using writing and reading for genuine 
communication, and increasing opportunities for interaction between 
students and teachers (Bromley, 1993; Burniske, 1994; Cobine, 1995a; 
Courts and McInerney, 1993; Garcia, 1998; Garmon, 2000; Graves, 2000;
Smith, 2000). However, dialogue journals require a lot of time from the
teacher to read and respond to student entries (Worthington, 1997). 

Peyton (1993) offers the following suggestions for responding to 
students’ dialogue writing: 

(a) commenting only on the content of the student’s entry, 

(b) asking open-ended questions and answering student questions, 

(c) requesting and giving clarification, and 

(d) offering opinions. 

However, Routman (2000) cautions that responding only to the content of 
the dialogue journals may lead students to get accustomed to sloppy writing 
and bad spelling as the norm for writing. 

Both Reid (1993) and Worthington (1997) agree that the dialogue 
journal partner does not have to be the teacher and that students can write 
journals to other students in the same class or in another class. They claim 
that this reduces the teacher’s workload and makes students feel 
comfortable in asking for advice about personal problems. In such a case, 
Worthington (ibid.) suggests that the teacher can put a box and a sign-in 
notebook in his/her office to help him/her monitor the journal exchanges 
between pairs. 

With access to computer networks, many educators and assessment 
specialists (e.g., Hackett, 1996; Knight, 1994; LeLoup and Ponterio, 1995; 
Peyton, 1993) suggest that students can keep electronic dialogue journals 
with the teacher or other students in different parts of the world. 

Research on Dialogue Journal Writing 

A review of recent dialogue journal studies indicated that keeping a 
dialogue journal improved students’ writing (e.g., Cook, 1993; Hannon, 
1999; Ho, 1992; Song, 1997; Worthington, 1997), and increased their self- 
confidence (e.g., Baudrand, 1992; Dyck, 1993; Hall, 1997). It is worth noting 
here that although dialogue journals were used in these studies as an 
instructional technique, the procedure of this technique actually involved 
an assessment stage at which teachers responded to the content of 
students’ entries. 

Regarding the effect of computer-mediated journals on students’ writing 
performance, the writer found that three studies were conducted in this area 
in the last ten years. In one of them, Ghaleb (1994) found that the quantity of 
writing in the networked class far exceeded that of the traditional class, and 
that the percentage of errors in the networked class dropped more than that 
of the traditional class. Based on these results, Ghaleb concluded that 
"computer-mediated communication . . . can provide a positive writing 
environment for ESL students, and as such could be an alternative to the 
laborious and time-engulfing method of the traditional approach to teaching 
writing" (p. 2865). In the second study, MacArthur (1998) found that 
writing dialogue journals using the word processor had a strong positive 
effect on the writing of students with learning disabilities. In the third 
study, Gonzalez-Bueno and Perez (2000) found that electronic dialogue 
journals had a positive effect on the amount of language generated by 
learners of Spanish as a second language, and on their attitudes towards 
learning Spanish, but did not have a significant effect on lexical or 
grammatical accuracy. 

2.2.5 Story Retelling 

Story retelling is a highly popular format of performance assessment 
(Kaiser, 1997; Pederson, 1995). It is an effective way to integrate oral and 
written language skills for both learning and assessment (May, 1994). 
Students who have just read or listened to a story can be asked to retell 
this story orally or in writing (Callison, 1998; Pierce and O’Malley, 1992). 
This format can also be used for assessing students’ reading 
comprehension. As Kaiser (1997) puts it, “Story retelling can play an 
important role in performance-based assessment of reading 
comprehension” (p. 2). 

The advantages of this format as an instructional and assessment 
technique include allowing students to share the cultural heritage of other 
people; enriching students’ awareness of intonation and non-verbal 
communication; relieving students from the classroom routine; 
establishing a relaxed, happy relationship between the storyteller and 
listeners; allowing the teacher to assess students in a relatively non- 
threatening atmosphere; and allowing the students to assess one another 
(Grainger, 1995; Hines, 1995; Kaiser, 1997; Malkina, 1995; Stockdale, 
1995). 

Wilhelm and Wilhelm (1999) suggest that when choosing tales for 
retelling, language difficulty, content appropriateness, and instructional 
objectives should be considered. They also suggest that after story 
retelling, the teacher should encourage students to evaluate their own 
retellings (ibid.). 

Kaiser (1997) suggests that students need to be aware of the structural 
elements of a story before they are asked to retell stories. She further 
suggests that this can be achieved through instruction and practice in 
story structure using a story map. However, Pederson (1995) suggests that 
storytelling lies within the storyteller and that storytellers must go beyond 
the rules and develop their own unique styles. 

The story retelling techniques include oral or written presentations, 
role playing, and pantomiming (Biegler, 1998). Students can retell the 
story in whatever way they prefer (ibid.). 

During retelling, Grainger (1995) suggests that the student should 
maintain eye contact, use gestures that come naturally, vary his/her voice, 
and give different tones to different characters. She further suggests that 
teachers may divide the class into small groups so that more students can 
retell stories at one time. In such a case, audio and video recording can be 
used to help in assessing students’ performance. Antonacci (1993) suggests 
that the teacher can help the student during retelling by clearing up 
misconceptions. To force students to listen attentively to the stories which 
their classmates tell, Gibson (1999) suggests that students should know in 
advance that one of them will tell the story again. 

After retelling, Pederson (1995) suggests using the following activities 
for assessing students’ performance: 

(a) analyzing and comparing characters, 

(b) discussing topics taken from the story theme, 

(c) summarizing or paraphrasing the story, 

(d) writing an extension of the story, 

(e) dramatizing the story, and 

(f) drawing pictures of the characters. 

Tompkins and Hoskisson (1995) suggest that “teachers can assess both 
the process students use to retell the story and the quality of the products 
they produce” (p. 131). They further suggest that assessing “the process of 
developing interpretations is far more important than the quality of the 
product” (loc. cit.). 

Research on Story Retelling 

To date there has been no research on story retelling as an on-going 
assessment format. However, indirect support for the use of this format 
comes from several studies which used storytelling as an instructional 
technique. The results of these studies revealed that this technique 
improved (a) reading comprehension (e.g., Biegler, 1998; Trostle and 
Hicks, 1998), (b) narrative writing (e.g., Gerbracht, 1994), (c) oral skills 
(e.g., Cary, 1998), and (d) self-esteem (e.g., Carroll, 1999; Lie, 1994). 
Indirect support for the use of this format also comes from Brenner’s 
study (1997). In this study, Brenner analyzed the elements of story 
structure used in written and oral retellings. Results indicated that 
written and oral retellings were of significant value in assessing students’ 
comprehension. Based on these results, she concluded that “monitoring 
students’ use of story structure elements provides a holistic method for 
the assessment of comprehension” (p. 4599). 

2.2.6 Oral Reading 

Listening to students reading aloud from an appropriate text can provide 
teachers with information on how students handle the cueing systems of 
language (semantic, syntactic, and phonemic) and on how they process 
information in their heads and in the text to construct meaning (Farrell, 
1993; Manning and Manning, 1996). During oral reading the teacher can 
code students’ miscues (Wallace, 1992). Through an analysis of such 
miscues, the teacher becomes aware of each student’s reading strategies as 
well as his/her reading difficulties (May, 1994; Pierce and O’Malley, 1992; 
Pike and Salend, 1995). Miscues are often analyzed in terms of their 
syntactic and semantic acceptability. The following four questions are 
often asked in this procedure (Watson and Henson, 1991): 

(a) Is the sentence, as finally read by the student, syntactically 
acceptable within the context? 

(b) Is the sentence, as finally read by the student, semantically 
acceptable within the entire context? 

(c) Does the sentence, as finally read by the student, change the 
meaning of the text? (This question is coded only if questions (a) 
and (b) are coded yes.) 

(d) How much does the miscue look like the text item? 
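
The dependency among these questions can be sketched in code. The following is only a minimal illustration, not Watson and Henson's published procedure: the names MiscueCoding and code_sentence, the yes/no representation of each answer, and the free-text label for graphic similarity are all assumptions made for the example. It shows one point from the list above, namely that the meaning-change question is coded only when the first two questions are both answered yes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MiscueCoding:
    """Answers to the four miscue questions for one sentence (hypothetical record)."""
    syntactically_acceptable: bool     # first question
    semantically_acceptable: bool      # second question
    meaning_changed: Optional[bool]    # third question; None when it is not coded
    graphic_similarity: str            # fourth question; free-text label (assumption)


def code_sentence(syntactic_ok: bool, semantic_ok: bool,
                  meaning_changed: bool, graphic_similarity: str) -> MiscueCoding:
    """Record the answers, coding meaning change only if both
    acceptability questions were answered yes."""
    coded_change = meaning_changed if (syntactic_ok and semantic_ok) else None
    return MiscueCoding(syntactic_ok, semantic_ok, coded_change, graphic_similarity)


if __name__ == "__main__":
    # Acceptable on both counts, so meaning change is coded.
    print(code_sentence(True, True, False, "high"))
    # Semantically unacceptable, so meaning change is left uncoded.
    print(code_sentence(True, False, True, "some"))
```

A teacher's tally sheet could hold one such record per sentence read aloud; leaving the third answer empty, rather than recording "no", keeps uncoded sentences distinct from sentences whose meaning was preserved.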

Once the miscue analysis is completed, the teacher should use the 
individual conferences to inform each student of his/her strengths and 
weaknesses and to suggest possible remedies for problems (May, 1994). 

During oral reading, the teacher can also observe a student’s 
performance by using anecdotal records (Rhodes and Nathenson-Mejia, 
1992). The open-ended nature of these records allows the teacher to 
describe students’ performance, to integrate present observations with 
other available information, and to identify instructional approaches that 
may be appropriate (ibid.). 

Since there is insufficient time for each student in a large class to 
present his/her oral reading to the teacher, students can record their oral 
readings and submit the tapes to the teacher to analyze and evaluate them 
at leisure (Tannenbaum, 1996). 

Research on Oral Reading 

A survey of research on oral reading revealed that three studies were 
conducted in this area in the last ten years. One of them (Kitao and Kitao, 
1996) used oral reading as a research tool for testing EFL students’ 
speaking performance. The second study addressed oral reading as an 
assessment and instructional technique. In this study, Adamson (1998) 
found that the use of oral reading as an on-going assessment and 
instructional technique provided the classroom teacher and parents with 
critical information about students’ literacy development. The third study 
addressed oral reading as an instructional technique. In this study, Atyah 
(2000) found that oral reading improved EFL students’ oral performance. 

2.2.7 Group Discussions 

Group discussions are a powerful format with which teachers can collect 
information on students’ oral and literacy performance (Butler and Stevens, 
1997; Graves, 2000). This format engages students in discussing what 
they have just read, listened to, or written, or any topics of interest to 
them. The advantages of this format as an instructional and assessment 
technique include encouraging students to express their own opinions, 
allowing students to hear different points of view, increasing students’ 
involvement in the learning process, developing students’ critical thinking 
skills, allowing the teacher to assess students’ performance in a relatively 
non-threatening atmosphere, and raising students’ motivation level 
(Greenleaf et al., 1997; Kahler, 1993; McNeill and Payne, 1996). However, 
the teacher may not have the time to observe all discussion groups in large 
classes (Auerbach, 1994). To overcome this difficulty, Kahler (1993) 
suggests that the teacher should videotape discussion sessions for later 
analysis and evaluation. 

May (1994) notes that the organization of groups and the choice of 
discussion topics play important roles in promoting successful assessment 
with group discussions. He further notes that students should be grouped 
so that they have something to offer each other, and that the discussion topics 
have to be of a problematic nature and relevant to the needs and interests of 
the students (ibid.). 

Kahler (1993) suggests that the main role of the teacher during group 
discussions is to act as language consultant to resolve communicative 
blocks, and to make notes of students’ strengths and weaknesses. The 
teacher can also use observational checklists for recording data about 
students’ performance (Secord et al., 1994). 

To promote group discussions, Zoya and Morse (2002) suggest that the 
teacher should: 

(a) choose an interesting topic, 

(b) give students some materials on the topic and time limits to 
read and discuss, 

(c) praise every student for sharing any ideas, 

(d) let the students organize groups according to their friendship, 
and 

(e) invite specialists to participate in group discussions. 

Nowadays, audio and video conferencing programs, such as CUSeeMe 
and MS NetMeeting, are available options for engaging students in voice 
conversation. Through such computer programs, students can talk 
directly to their key pals anywhere in the world. They can also see and 
be seen by the key pals they are addressing. Meanwhile, teachers can 
observe their discussions and progress, and make comments to individual 
students (Higgins, 1993; Kamhi-Stein, 2000; LeLoup and Ponterio, 2000; 
Muller-Hartmann, 2000; Sussex and White, 1996). 

Research on Group Discussions 

A survey of recent research on group discussions revealed that no studies 
used this format as an on-going assessment tool. However, indirect 
support for the use of this format comes from four studies that used group 
discussions as an instructional technique. These studies indicated that 
group discussions served to develop students’ literacy (Troyer, 1992); 
enriched students’ literacy understandings (Allen, 1994); encouraged 
students to introduce themes relevant to their age, interests, and personal 
circumstances and to develop these themes according to their own frame 
of reference (McCormack, 1995); and improved students’ overall 
performance both at the individual and group levels (Mintz, 2001). 

2.2.8 Role Playing 

Role playing can be used not only as an activity to help students improve 
their language performance, but also as an assessment format to help the 
teacher assess students’ language performance (Davies et al., 1999; 
Tannenbaum, 1996). 

The advantages of role playing as an instructional and assessment 
technique include developing students’ verbal and non-verbal 
communication skills, increasing students’ motivation to learn, promoting 
students’ self-confidence, integrating language skills, developing students’ 
social skills, allowing the students to know and assess one another, and 
allowing the teacher to know and assess students in a relatively non- 
threatening setting (Haozhang, 1997; Krish, 2001; Maxwell, 1997; 
Tompkins, 1998). 

Before role playing, the students are given fictitious names to 
encourage them to act out the roles assigned to them (Tompkins, 1998). 
Additionally, McNamara (1996) suggests that each student should be 
given a card on which there are a few sentences describing what kind of a 
person he or she is. However, Kaplan (1997) argues against role plays that 
focus solely on role cards as they do not capture the spontaneous, real-life 
flow of conversation. 

During role playing, the assessor, usually the teacher, can take a minor 
role in order to be able to control the role play (Tompkins and Hoskisson, 
1995). He/she can also support or guide students to perform their roles 
(ibid.). This “scaffolded assessment,” as Barnes (1999) notes, has a 
“learning potential” (p. 255). However, Weir (1993) claims that role 
playing will be more successful if it is done with small groups with the 
assessor as observer. He further states that if the assessor “is not involved 
in the interaction he has more time to consider such factors as 
pronunciation and intonation” (p. 62). 

At the end of role playing, Krish (2001) suggests that the teacher 
should get feedback from the role players on their participation during 
the preparation stage and the presentation stage. 

Since there is insufficient time for each group in a large class to present 
their role plays to the whole class, Haozhang (1997) suggests that each 
group should record their role play and submit the tape signed with their 
names to the teacher for assessment. 

To make role playing more effective as an instructional and assessment 
technique, Burns and Gentry (1998) suggest that teachers should choose 
role plays that match the language level of the students. 

To capture students’ interest, Al-Sadat and Afifi (1997) suggest that 
role-playing “must be varied in content, style, and technique” (p. 45). 
They add that role plays may be “comic, sarcastic, persuasive, or 
narrative” (loc. cit.). 

Research on Role Playing 

A survey of recent research on role playing indicated that only one study 
(Kormos, 1999) used this format as a research tool for assessing students’ 
speaking performance. Furthermore, two other studies investigated 
students’ perceptions of role playing as an instructional and assessment 
technique. In one of these studies, Kaplan (1997) found that students 
learning French as a foreign language felt that role playing boosted their 
confidence in speaking French. In the other study, Krish (2001) found that 
EFL students felt that role playing improved their English and developed 
their confidence to take part in this activity in the future. 

2.2.9 Teacher-Student Conferences 

Teacher-student conferences are another format for assessing students’ 
language performance (Ediger, 1999; Newkirk, 1995). Teachers often hold 
such conferences to talk with students about their work, to help them 
solve a problem related to what they are learning, and to assess their 
language performance (Fisher, 1995). 

The advantages of teacher-student conferences as an instructional and 
assessment technique include allowing teachers to determine students’ 
strengths and weaknesses in language performance; providing an avenue 
for students to talk about their real problems with language; allowing the 
teacher to discover students’ needs, interests, and attitudes; and 
integrating language skills (McIver and Wolf, 1998; Patthey and Ferris, 
1997; Sythes, 1999). 

Teacher-student conferences may be formal or informal (Fisher, 1995). 
They may also be held with individuals or groups (May, 1994). Individual 
conferences are superior to group conferences in allowing the teacher to 
diagnose the strengths and weaknesses of each student (ibid.). However, it 
may be difficult to hold individual conferences in large classes (Tarone 
and Yule, 1996). 

While conferring with the student, Fisher (1995) suggests that the 
teacher should fill in a conference form. In such a form he/she should 
record the date, the conference topic, the student’s strengths and 
weaknesses, and what the student will do to overcome his/her learning 
difficulties (ibid.). Furthermore, Tompkins and Hoskisson (1995) suggest 
that during the conference, the teacher should act only as a listener or 
guide, as this role allows him/her to know a great deal about students and 
their learning. However, Hansen (1992, p. 100; cited in May, 1994, p. 397) 
suggests that the teacher should ask the following questions during the 
conference: 

(a) What have you learned recently in writing? 

(b) What would you like to learn next to become a better writer? 

(c) How do you intend to do that? 

(d) What have you learned recently in reading? 

(e) What would you like to learn next to become a better reader? 

(f) How do you intend to do that? 

After the conference, the teacher should keep the conference form 
(filled in during the conference) in a folder along with other evaluation forms 
to help him/her keep track of the student’s progress in language 
performance (Fisher, 1995). 

With access to modern technology, some educators (e.g., Freitas and 
Ramos, 1998; Marsh, 1997) suggest that teacher-student conferences can 
be mediated through the computer. 

Research on Teacher-Student Conferences 

A survey of recent research on teacher-student conferences revealed that 
whereas several studies analyzed students’ and teachers’ behaviors during 
such conferences (e.g., Boreen, 1995; Boudreaux, 1998; Forsyth, 1996; 
Gill, 2000; Keebleer, 1995; Nickel, 1997), no studies investigated the effect 
of this format as an on-going assessment technique on students’ language 
performance. 

2.2.10 Verbal Reports 

Verbal reports refer to learners’ descriptions of what they do while 
performing a language task or immediately after completing it. Such 
descriptions develop students’ metacognitive awareness and make 
teachers aware of their students’ learning processes (Anderson, 1999; 
Matsumoto, 1993). Such an awareness can help students make conscious 
decisions about what they can do to improve their learning (Benson, 2001; 
Ericsson and Simon, 1993). It can also help teachers assist students who 
need improvement in their learning processes (Chamot and Rubin, 1994; 
May, 1994). However, students may change their actual learning 
processes when teachers ask them to report on these processes (O’Malley 
and Chamot, 1995). 

Verbal reports may be introspective or retrospective. Introspective 
reports are collected as the student is engaged in the task. This type of 
report has been criticized for interfering with the processes of task 
performance (Gass and Mackey, 2000). Retrospective reports are 
collected after the student completes the task. This type of report has 
been criticized because students may forget or inaccurately recall the 
mental processes they employed while doing the task (Smagorinsky, 1995). 

To help students produce useful and accurate verbal reports, Anderson 
and Vandergrift (1996) suggest that the teacher should: 

(a) provide training for students in reporting their learning 
processes, 

(b) elicit verbal reports as close to the students’ completion of the 
task as possible, or even better, during the language task, 

(c) provide students with some contextual information to help 
them remember the strategies they used while doing the task if 
the report is retrospective, 

(d) videotape students while doing the task, and 

(e) allow students to use either L1 or L2 to produce their verbal 
reports. 

There are different opinions with respect to the validity and reliability 
of verbal reports. However, many assessment specialists (Alderson, 2000; 
Storey, 1997; Wu, 1998) agree that verbal reports can be valuable sources 
of information about students’ cognitive processes when they are elicited 
with care and interpreted with full understanding of the conditions under 
which they were obtained. 

Research on Verbal Reports 

A survey of research on introspective and retrospective verbal reports 
indicated that several studies used this format as a research tool for 
investigating the processes test-takers employ in responding to test tasks 
(e.g., Gibson, 1997; Storey, 1997; Wijgh, 1996; Wu, 1998), and for 
exploring students’ learning processes (e.g., El-Mortaji, 2001; Feng, 2001; 
Kasper, 1997; Lynch, 1997; Robbins, 1996; Sens, 1993). 

In addition to the above studies, two other studies were conducted in 
this area. In one of them, Allan (1995) investigated whether students can 
effectively report their thoughts. Results indicated that many students 
were not highly verbal and found it difficult to report their thought 
processes. In the other study, Anderson and Vandergrift (1996) 
investigated the effect of verbal reports as an on-going assessment tool on 
students’ awareness of their reading processes and their reading 
performance. Results indicated that students’ use of verbal reports as a 
classroom activity helped them become more aware of their reading 
strategies and improved their reading performance. 

Additional formats/instruments for evaluating students’ performance include 
essay writing, dramatization, demonstrations, experiments, etc. 

2.3 Criteria for Selecting Performance Assessment Formats 

In selecting from the previously-mentioned formats, four general 
considerations should be kept in mind. First, selection should be guided 
primarily by the format’s match to the teaching/learning targets, as a mismatch between 
the assessment format and these targets will lower the validity of the results 
(Nitko, 2001). The second consideration in selecting among performance 
assessment formats is the area of assessment. Some of the previously- 
mentioned formats are compatible with reading and writing while others are 
compatible with listening and speaking; some are suitable for assessing language 
products while others are suitable for assessing learning processes. The third 
consideration in selecting among performance assessment formats is that no 
single format is sufficient to evaluate a student’s performance (Shepard, 2000). 
In other words, multiple assessment formats are necessary to provide a more 
complete picture of a student’s performance. The final consideration in 
determining the specific assessment format is that performance is best assessed 
if the selected format is used as a teaching or learning technique rather than as 
a formal or informal test (Cheng, 2000; McLaughlin and Warran, 1995; 
O’Malley, 1996). 

Chapter Three 

Alternative Groupings for Performance Assessment 



3.0 Introduction 

Many educators and assessment specialists (e.g., Barnes, 1999; Campbell et al., 
2000; O'Neil, 1992; Santos, 1997) claim that students themselves need to be 
involved in the process of assessing their own performance. This can be done 
through self-assessment, peer-assessment, and collaborative group assessment. 
Each of these alternatives is discussed below. 

3.1 Self-Assessment 

Self-assessment has been offered as one of the alternatives to teacher 
assessment. Kramp and Humphreys (1995) define this alternative as “a 
complex, multidimensional activity in which students observe and judge 
their own performances in ways that influence and inform learning and 
performance” (p. 10). Many educators claim that this type of assessment has 
several advantages. The first of these advantages is that it promotes 
students' autonomy (Ekbatani, 2000; Graham, 1997; Williams and Burden, 
1997; Yancey, 1998). The second advantage is that the involvement of 
students in assessing their own learning improves their metacognition which 
can, in turn, lead to better thinking and better learning (Andrade, 1999; 
O'Malley and Pierce, 1996; Steadman and Svinicki, 1998). The third 
advantage of this type of assessment is that it enhances students' motivation 
which can, in turn, increase their involvement in learning and thinking 
(Angelo, 1995; Coombe and Kinney, 1999; Todd, 2002). The fourth 
advantage of this type of assessment is that it fosters students' self-esteem 
and self-confidence, which can, in turn, encourage them to see the gaps in 
their own performance and to quickly begin filling these gaps (Smolen et al., 
1995; Statman, 1993; Wood, 1993). The fifth and final advantage of self- 
assessment is that it alleviates the teacher’s assessment burden (Cram, 1995). 

However, opponents of self-assessment claim that this type of assessment 
is an unreliable measure of learning and thinking. They further claim that 
the unreliability of this type of assessment is due to two main reasons. The 
first reason is that students may under- or over-estimate their own 
performance (McNamara and Deane, 1995). The second reason is that 
students can cheat when they assess their own performance (Gardner and 
Miller, 1999). Another disadvantage of self-assessment is that only a few 
students may engage in it (Cram, 1993). 

Gipps (1994) suggests that learners need sustained training in ways of self- 
assessment to become competent assessors of their own performance. In 
support of this suggestion, Marteski (1998) found that instruction in self- 
rating criteria had a positive effect on students’ ability to assess their 
writing. 

Barnes (1999) makes the point that questions can encourage learners to 
evaluate their own performance in a more structured way. She adds that 
these questions should be generic such as “How are you doing and what do 
you need to do to improve?” Answers to such questions can help the learner 
decide exactly what is needed (ibid.). Barnes (1996) also suggests that 
self-assessment can be aided through the use of a logbook or a course guide. 

Arter and Spandel (1992) suggest asking students the following questions 
to encourage them to engage in self-assessment: 

(a) What is the process you went through to complete this assignment? 

(b) Where did you get ideas? 

(c) What are the problems you encountered? 

(d) What revision strategies did you use? 

(e) How does this activity relate to what you have learned before? 

(f) What are the strengths of your work? and 

(g) What still makes you uneasy? 

Anderson (2001) suggests that teachers can help students evaluate their 
strategy use by asking them to respond thoughtfully to the following 
questions: 

(a) What are you trying to accomplish? 

(b) What strategies are you using? 

(c) How well are you using them? and 

(d) What else could you do? 

Furthermore, a number of instruments have been developed for 
encouraging students to engage in assessing their own learning processes and 
products. These instruments include K-W-L charts, learning logs, and self- 
assessment checklists. Each of these instruments is briefly described below. 

(1) K-W-L Charts 

The K-W-L chart (what I “Know”/what I “Want” to know/what I’ve 
“Learned”) is one form of self-assessment instrument (O’Malley and 
Chamot, 1995). The use of this chart improves students’ learning 
strategies, keeps learners focused and interested during learning, and 
gives them a sense of accomplishment when they fill in the L column 
after learning (Shepard, 2000). 

Tannenbaum (1996) suggests that this chart can be used as a class 
activity or on an individual basis before and/or after learning, and that 
this chart can be completed in the first language for students with 
limited English proficiency. 

Research on K-W-L Charts 

A survey of recent research on the K-W-L chart revealed that only one 
study was conducted in this area. In this study, Burns (1995) found that 
this chart had a significant effect on fifth-grade students’ reading 
comprehension. 

(2) Learning Logs 

Learning logs are a self-assessment tool which students keep about what 
they are learning, where they feel they are making progress, and what 
they plan to do to continue making progress (Carlisle, 2000; Lee, 1997; 
Pike and Salend, 1995; Yung, 1995). At regular intervals, the students 
reflect on and analyze what they have written in their logs to diagnose 
their own strengths and weaknesses and to suggest possible remedies for 
problems (Castillo and Hillman, 1997; Cobine, 1995b). Additional 
advantages of this format as a learning and assessment technique are 
(Angelo and Cross, 1993; Commander and Smith, 1996; Conrad, 1995; 
Kerka, 1996): 

(a) encouraging students to become self-reflective, 

(b) promoting autonomous learning, 

(c) fostering students’ self-confidence, and 

(d) providing the teacher with assessable data on students’ 
metacognitive skills, and with valuable suggestions for 
improving students’ performance. 

However, learning logs require time and effort from students and 
teachers (Angelo and Cross, 1993). Moreover, unless a continuing 
attempt is made to focus on strengths, this format can leave students 
demoralized from paying too much attention to their weaknesses and 
failures (ibid.). 

McNamara and Deane (1995) propose many activities that students 
might describe in their logs. These activities include listening to the 
radio, watching TV, speaking and writing to others, and reading 
newspapers. They further state that for each experience, students should 
record the date, the activity, the amount of time engaged in the use of 
English, the ease or difficulty of the activity, and the reasons for the ease 
or difficulty of this activity (ibid.). Cranton (1994) suggests that the 
learner can use one side of a page for the description of his/her activities 
and the other for thoughts and feelings stimulated by this description. 

Since students may find it difficult to know what to write in their logs, 
Walden (1995) suggests that teachers should give them specific guiding 
questions such as “What did you learn today and how will you apply that 
learning?” 

Paterson (1995) suggests that learning logs should be shared with the 
teacher. In such a case, the teacher should not grade them for writing 
style, grammar, or content, but they can be considered as part of the 
overall assessment (ibid.). 

Perham (1992) and Perl (1994) agree that learning logs can be shared 
with other students in the class. They further suggest using a loose-leaf 
notebook, accessible to the whole class, in which learners can reflect on 
what they learn and read other students’ reflections. 

Research on Learning Logs 

A review of recent research on learning logs revealed that studies 
conducted in this area varied in focus, as briefly shown below. 

Holt (1994) found that six of the ten students who kept learning logs 
did not find this format helpful. In light of this result, Holt concluded 
that either the guiding questions those students were given did not 
motivate reflection or they did not know how to write reflectively. 

Matsumoto (1996) found that learning logs improved students’ 
reflection. 

Demolli (1997) found that learning logs along with group discussions 
increased students’ abilities to use critical thinking skills. 

Saunders et al. (1999) investigated the effects of literature logs, 
instructional conversations, and literature logs plus instructional 
conversations on ESL students’ story comprehension. Results indicated 
that students in the literature logs group and literature logs plus 
instructional conversations group scored significantly higher than those 
in the conversations group on story comprehension. 

Vann (1999) investigated the effects of students’ daily learning logs as 
a means of assessing the progress of advanced ESL students at the 
college level. Results indicated that this technique enhanced both 
teaching and learning. 

Halbach (2000) found that learning logs revealed many differences 
between successful and less successful students with respect to their 
learning strategies. 

(3) Self-Assessment Checklists 

A checklist consists of a list of specific behaviors and a place for checking 
whether each is present or absent (Tenbrink, 1999). Through the use of 
checklists students can evaluate their own learning processes or products 
(Angelo and Cross, 1993; Burt and Keenan, 1995; Harris et al., 1996). 
Such checklists can be developed by the teacher or the students 
themselves through classroom discussions (Meisles, 1993). Moreover, 
many examples of checklists are nowadays available for students to use 
for self-assessing their own learning processes (e.g., Oxford’s SAS, 1993) 
and products (e.g., Robbins’ Effective Communication Self-Evaluation, 
1992). These checklists help students diagnose their own strengths and 
weaknesses, and help teachers adapt their teaching strategies to suit 
students’ levels and learning style preferences (Tenbrink, 1999). 
However, such checklists often focus on bits and pieces of students’ 
performance (ibid.). Furthermore, the preparation of checklists is rather 
time-consuming (Angelo and Cross, 1993). 

Research on Self-Assessment Checklists 

A survey of recent research on self-assessment checklists revealed that 
only one study was conducted in this area. In this study, Allan (1995) 
found that ready-made checklists risked skewing students’ responses to 
those the checklist writer had thought of. 

In addition to the previously mentioned instruments, other performance 
assessment formats (e.g., portfolios, dialogue journals) provide opportunities 
for self-assessment. 

Additional Research on Self-Assessment 

In addition to the empirical studies conducted in the areas of learning logs and 
self-assessment checklists, many other studies were conducted on self- 
assessment in the last ten years. These studies fall into two broad categories: 
(1) investigating students’ ability to assess themselves and/or the factors that 
affect this ability, and (2) investigating the effects of self-assessment on 
student motivation and language performance. The first category includes six 
studies and two review articles. In one of the six studies, Moritz (1995) found 
that self-assessment was influenced by many factors such as language 
learning background, experience, and self-esteem. She (ibid.) concluded that 
“it seems unreasonable to employ self-assessment as a measurement tool in 
any situation which entails a comparison of students’ abilities. It may, 
however, be a useful formative learning device, that is, one used throughout 
the course of a learning program, for feedback to both learners and teachers 
about the learners’ progress, strengths, and weaknesses” (p. 2592). In the 
second study, Thomson (1995) found that learners were capable of carrying 
out self-assessment, but noted some variations in the levels of their self- 
ratings according to gender and ethnic background. In the third study, 
Graham (1997) found that effective language learners seemed willing and 
able to assess their own progress. In the fourth study, Shameem (1998) found 
that Indo-Fijians self-reported their oral Fiji Hindi ability at a level higher 
than their judged level of performance. In the fifth study, Shoemaker (1998) 
found that fourth-grade students with special education needs provided 
evidence of their ability to engage in self-assessment of literacy learning 
when they were asked to do so, but their self-assessments tended to reflect 
surface elements of reading and writing rather than reflections of strategic 
thinking. In the sixth study, Kruger and Dunning (1999) found that learners 
whose skills or knowledge bases were weak in a particular area tended to 
overestimate their ability in this area. 

In one of the two reviews undertaken in this area, Cram (1995) found that 
the accuracy of self-assessment varied according to several factors, including 
the type of assessment, language proficiency, academic record and degree of 
training; and that students’ willingness and ability to engage in self- 
assessment practices increased with training. She (ibid.) recommended that 
self-assessment can work best in a supportive environment in which 
“teachers would place high value on independent thought and action; [and] 
learners’ opinions would be accepted non-judgmentally” (p. 295). In the 
other review, Oscarson (1997) concluded that learners are capable of 
assessing their own language proficiency under appropriate conditions. 

The second category includes only two studies. In one of these studies, 
Smolen et al. (1995) found that self-assessment developed students’ self- 
awareness and self-confidence and improved the quantity of their writing. In 
the other study, Diaz (1999) investigated the effects of self-assessment on 
student motivation and second language proficiency. Results indicated that 
self-assessment helped to improve students’ motivation as well as their oral 
and written proficiency in the target language. 

3.2 Peer-Assessment 

Many performance assessment specialists (e.g., Johnson, 1998; Norris, 1998; 
Van-Daalen, 1999) advocate the use of peer-assessment as an alternative to 
teacher assessment. The advantages of this alternative as a learning and 
assessment technique include (O’Donnell, 1999; King, 1998; Topping and 
Ehly, 1998): 

(a) helping students learn from each other, 

(b) developing students’ sense of responsibility for their fellows’ progress, 

(c) reinforcing students’ self-esteem and self-confidence, and 

(d) saving the teacher’s time. 

Hughes and Large (1993) suggest that learners need training in 
performance assessment before they are asked to assess each other. King (1998) 
further suggests that students should be given assessment forms to use while 
assessing each other. Furthermore, Mansour and Mansour (1998) propose that 
at the end of peer assessment the teacher should assess students’ assessments. 

Anderson and Vandergrift (1996) suggest that peers can be involved in 
assessing the strategies they employ while doing a language task. 

Research on Peer-Assessment 

A survey of recent research on peer-assessment revealed that studies 
conducted in this area focused on students’ perceptions of peer-assessment, 
the effect of peer vs. teacher assessment on students’ writing performance, and 
the effect of peer- vs. self-assessment on students’ writing performance. 

With respect to students’ perceptions of peer assessment, Qiyi (1993) 
discovered that Chinese EFL students who used peer-assessment found 
themselves more interested in the writing class than before and thought that 
peer-assessment helped them make greater gains in writing quality than did 
the teacher evaluation. Similarly, Huang (1995) investigated university 
students’ perceptions of peer-assessment in an EFL writing class. Results 
indicated that students had a positive perception of how they and their peers 
performed in peer-assessment sessions. 

With respect to the effect of peer vs. teacher assessment, only one study was 
conducted in this area in the last ten years. In this study, Richer (1993) 
investigated the effect of peer-directed vs. teacher-based assessment on first-year 
college students’ writing proficiency. Results showed that there was a 
significant difference in writing proficiency in favor of the peer-assessment 
group. 

With respect to the effect of peer- vs. self-assessment, only two studies were 
conducted in this area in the last ten years. In one of these studies, Mooko 
(1996) investigated the effect of guided peer-assessment vs. guided self- 
assessment on the quality of ESL students’ compositions. Results revealed that 
guided peer-assessment was superior to guided self-assessment in enabling 
students to refine the opening (introduction) and closing statements 
(conclusion) of their compositions, and in assisting in the reduction of micro- 
level errors. Results also revealed that self-assessment was more effective than 
guided peer-assessment in improving composition content. In the other study, 
Al-Hazmi (1998) investigated the effect of peer-assessment vs. self-assessment 
on the quality of word-processed ESL compositions. Results indicated that both 
peer-assessment and self-assessment improved the quality of EFL students’ 
writing. However, subjects in the peer-assessment group showed slightly more 
improvement between drafts with respect to mechanics, grammar, vocabulary, 
organization, and content than those in the self-assessment group. The self- 
assessment subjects, nonetheless, recorded slightly higher scores in their final 
drafts for mechanics, language use, vocabulary, organization, content, and 
length. 

3.3 Group Assessment 

Group assessment is a further extension of peer-assessment. This type of 
assessment provides students with a genuine audience whose response is 
immediate (Barnes, 1999; Berridge and Muzamhindo, 1998). Moreover, 
through involvement in group assessment students become more critical of 
their own work (Graham, 1997). Additional advantages of collaborative group 
assessment are (Stahl, 1994; Webb, 1995): 

(a) developing students’ sense of responsibility, 

(b) helping weak students to learn from their colleagues, 

(c) developing students’ social skills, and 

(d) reducing the assessment load of the teacher. 

However, compared to self- and peer-assessment, group assessment requires 
more preparation from the teacher to form groups. Additionally, conflict is 
more likely to arise among group members (Imel, 1992). Therefore, the teacher 
should move among groups to observe group members while they assess their 
own performance, and to resolve the conflicts that may arise among them. 

Research on Group Assessment 

A survey of recent research in the area of group assessment revealed that 
only one study was conducted in this area in the last ten years. In this study, 
Lejk, Wyvill, and Farrow (1999) found that low-ability students 
performed better when their work was done and assessed in mixed-ability 
groups and that high-ability students obtained lower grades in heterogeneous 
groups than in homogeneous groups. 

To sum up this chapter, the writer claims that we cannot assume that students 
at all levels are capable of assessing their own performance in English as a foreign 
language. Nor can we assume that teachers have the time to continuously assess 
all students’ performance in large classes. Therefore, both teachers and students 
need to be involved in the process of assessment. 



Chapter Four 

Performance Assessment 
via Computers 



4.1 Theoretical Background 

In response to the widespread use of computers at schools, homes and 
workshops, many educators (e.g., Alderson, 1999, 2000; Bernhardt, 2000; 
Darling-Hammond et al., 1995; Gruba and Corbel, 1997) call for the 
administration of performance tests via computers. Such educators claim that 
advances in multimedia and web technologies offer the potential for designing 
and developing performance tests that are more interactional than their paper- 
and-pencil counterparts. They also claim that the computer lends authenticity 
to assessment tasks because it is connected to students’ lives and to their 
learning experiences. 

4.2 Research on Performance Assessment via Computers 

The introduction of computer administered tests raised a concern about the 
equivalence of performance yielded via computers versus paper-and-pencil 
tests. As a result of this concern many studies investigated the effect of 
computer versus paper-and-pencil tests on students’ performance. In this 
respect, Mead and Drasgow (1993) reported, on the basis of a meta-analysis of 29 
studies, that computerized tests were slightly harder than paper-and-pencil tests. They 
concluded that the results of their meta-analysis “provide strong support for 
the conclusion that there is no medium effect for carefully constructed power 
tests” (p. 457). 

In a more recent review, Sawaki (1999) also found that there was little 
consensus in research findings regarding whether test takers performed 
better on, or preferred, computer-based as opposed to paper-and-pencil tests of 
reading. However, Russell and Haney (1997) found that students accustomed to 
writing on computers performed substantially better when writing on the 
computer than when writing by hand. 



Chapter Five 

Reliability and Validity of Performance 
Assessment 

5.1 Theoretical Background 

As opposed to standardized forms of testing, performance-based assessment does 
not have clear-cut right or wrong answers. However, advocates of performance 
assessment claim that there are ways to make performance assessment valid and 
reliable. The first way is to use performance assessment rubrics (Boyles, 1998; 
Linn, 1993). Such rubrics, as Elliott (1995) suggests, should be developed jointly by 
the teacher and students. In support of developing the assessment rubrics in this 
way, Graves (2000) found that by assessing students’ performance with rubrics 
created jointly by the teacher and students, there was “much less cause for 
complaint, whining, accusations of unfairness, or claims of ignorance” (p. 229). 
Furthermore, allowing “students to assist in the creation of rubrics may be a 
good learning experience for them” (Brualdi, 1998, p. 3). Additional advantages 
of the development of assessment rubrics with students are: 

(a) allowing students to know how their own performance will be evaluated 
and what is expected from them, and 

(b) promoting students’ awareness of the criteria they should use in self- 
assessing their own performance. 

The assessor can use either holistic or analytic assessment rubrics for the 
evaluation of students’ performance. However, many performance assessment 
specialists (e.g., Hyslop, 1996; Moss, 1997; Pierce and O’Malley, 1992; Wiig, 
2000) strongly advocate the use of holistic rubrics for the assessment of 
students’ performance. Such assessment specialists contend that these rubrics 
focus on the communicative nature of the language. As Pierce and O’Malley 
(1992) put it, “Scoring criteria should be holistic with a focus on the student’s 
ability to receive and convey meaning. Holistic scoring procedures evaluate 
performance as a whole rather than by its separate linguistic or grammatical 
features” (p. 4). However, the use of such rubrics may result in wide 
discrepancies among raters (Davies et al., 1999). Therefore, the second way to 
make performance assessment valid and reliable is to have a student’s 
performance assessed by two or more raters (McNamara, 1997b; Mehrens, 
1992). These raters should agree upon the assessment criteria and obtain 
similar scores on some performance samples prior to scoring (Ruth, 1998). 
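
The check that raters "obtain similar scores on some performance samples prior to scoring" can be illustrated with a small calculation. The following is only a minimal sketch, not a procedure taken from the studies cited above: it assumes two raters have each assigned a holistic score on a 1-4 scale to the same set of sample performances, and it reports the proportion of exact agreement together with Cohen's kappa, a standard chance-corrected agreement index. The function names and the sample scores are hypothetical.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of samples on which the two raters gave the same score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty set of samples.")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: probability that both raters pick the same score
    # if each assigned scores independently at their own observed rates.
    expected = sum(counts_a[score] * counts_b[score] for score in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical score throughout
    return (observed - expected) / (1 - expected)


# Hypothetical holistic scores (1-4) given by two raters to eight samples.
scores_rater_a = [3, 4, 2, 4, 1, 3, 2, 4]
scores_rater_b = [3, 4, 2, 3, 1, 3, 2, 4]

print(f"Exact agreement: {percent_agreement(scores_rater_a, scores_rater_b):.3f}")
print(f"Cohen's kappa:   {cohens_kappa(scores_rater_a, scores_rater_b):.3f}")
```

Note that this unweighted form of kappa treats rubric scores as unordered categories, so a disagreement of one scale point counts the same as a disagreement of three. For ordered rubric scales, a weighted variant may be more appropriate.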

The third way to make performance assessment valid and reliable is to use 
multiple assessment formats for assessing the same learning objective. Shepard 
(2000) expresses this idea in the following way: 

Variety in assessment techniques is a virtue, not just because 
different learning goals are amenable to assessment by different 
devices, but because the mode of assessment interacts in complex 
ways with the very nature of what is being assessed. For example, the 
ability to retell a story after reading it might be fundamentally a 
different learning construct than being able to answer 
comprehension questions about the story: both might be important 
instructionally. Therefore, even for the same learning objective, 
there are compelling reasons to assess in more than one way, both to 
ensure sound measurement and to support development of flexible 
and robust understandings. (p. 48) 

It is worth noting here that some performance assessment specialists (e.g., 
Bachman and Palmer, 1997; Kunnan, 1999; Moss, 1994, 1996) argue against a 
reliance on the traditional, fragmented approach to reliability and validity as 
the sole or best means of achieving fairness and equity in evaluating students’ 
performance. Kunnan (1999), for example, gives primacy to test fairness and 
argues that “if a test is not fair there is little or no value in it being valid and 
reliable or even authentic and interactive” (p. 10). He further proposes that 
fairness in language assessment can be achieved through the following: 

(a) equity in constructing the test in terms of culture, academic discipline, 
gender, etc., 

(b) equity in treatment in the testing process (e.g., equal testing conditions, 
equal opportunity to be familiar with testing formats and materials), and 

(c) equity in the social consequences of the test (e.g., access to university, 
promotion). 

Beyond the concern with traditional reliability and validity, Bachman and 
Palmer (1997) also propose that the most important consideration in designing 
a performance test is its usefulness. They add that the usefulness of a language 
test can be defined in terms of the following six qualities: 

(a) consistency of measurement, 

(b) meaningfulness and appropriateness of the interpretations that we make on 
the basis of the test scores, 

(c) authenticity of the test tasks — that is, the correspondence between the 
characteristics of the target language use tasks and those of the test tasks, 

(d) interactiveness of the test tasks — that is, the capacity of the test tasks to 
engage the test taker in performing cognitive and metacognitive aspects of 
language, 

(e) impact of the test on the society and educational system, and on the 
individuals within this system, and 

(f) availability of the resources required for the design, development, and use 
of the test. 

Moss (1994, 1996) also argues that the traditional approach to reliability and 
validity is “inadequate to represent social phenomena” (Moss, 1996, p. 21). She 
further proposes a unified approach to reliability and validity, the hermeneutic 
approach, which requires the inclusion of teachers’ voices in the context of 
assessment and a dialogue among judges about the specific performance being 
evaluated. 

Furthermore, Baker and her colleagues (1993) suggest that, beyond the 
fragmented approach to reliability and validity, there are five characteristics 
that performance assessment should exhibit. These characteristics are: 

(a) meaning for students and teachers, 

(b) current standards of language performance, 

(c) demonstration of complex cognition which is applicable to important 
problem areas, 

(d) explicit criteria for judgment, and 

(e) minimizing the effects of ancillary skills that are irrelevant to the focus 
of assessment. 

5.2 Research on the Validity and Reliability of Language 
Performance Assessment 

Empirical evidence in support of the claims concerning the reliability and 
validity of language performance assessment has in general been lacking. In 
contrast, several studies found differences in language performance ratings due to 
rater characteristics (e.g., background, experience) both in the assessment of 
speaking (e.g., Brown, 1995; Chalhoub-Deville, 1996; McNamara, 1996) and of 
writing (e.g., Lukmani, 1996; Schoonen et al., 1997; Weigle, 1998; Wolfe, 1995). 
Moreover, some studies found that rater differences survived training (Lumley 
and McNamara, 1995; McNamara and Adams, 1994; Tyndall and Kenyon, 
1995). Lumley and McNamara (1995), for example, examined the stability of 
speaking performance ratings by a group of raters on three occasions over a 
period of 20 months. These raters participated in a training session, followed by 
the rating of a series of audiotaped recordings of speaking performance to 
establish their reliability. The results of the study indicated that rater 
differences survived this training. Based on these results, the researchers 
concluded that 

One point that emerges consistently and very strongly from all of 
these analyses is the substantial variation in rater harshness, which 
training has by no means eliminated, nor even reduced to a level 
which would permit reporting of raw scores for candidate 
performance. (p. 69) 

In another line of research, some investigators found differences in students’ 
performance across different types of speaking performance tasks (e.g., 
McNamara and Lumley, 1997; Shohamy, 1994; Upshur and Turner, 1999) and 
of reading performance tasks (e.g., Riley and Lee, 1996). 

The results of the previously-mentioned studies indicate that the reliability 
and validity of performance assessment remain a major obstacle in the 
implementation of this type of assessment and that assessment specialists need 
to exert considerable effort to refine the criteria as well as the procedures by which 
teachers can establish the reliability and validity of this type of assessment. 



Chapter Six 

Summary and Conclusions 

The last ten years have seen a growth of interest in performance assessment. This 
interest has led to the development of many alternatives which teachers or 
assessors can use to elicit and assess students’ performance. These alternative 
assessment techniques highlight the assessment of language as communication, 
and integrate assessment with learning and instruction. However, for the time 
being, such techniques remain difficult and costly to use for high-stakes 
assessment (Wrigley, 2001). Furthermore, assessment specialists are still 
refining the criteria and procedures by which teachers can establish the 
reliability and validity of these alternatives (Van-Duzer, 2002). Therefore, my 
own view is that we should utilize both quantitative and qualitative assessment 
tools in a complementary fashion. In other words, it seems reasonable to 
employ performance assessment as a formative learning device throughout the 
course of the curriculum for feedback to both teachers and learners, and 
quantitative measures at the end of the curriculum for the comparison of 
students’ abilities. This conclusion is supported by Nitko (2001) in the following 
way: 



If your evaluations are based only on one type of assessment format 
(e.g., if you rely only on performance tasks), you are likely to have an 
incomplete picture of each student learning. You increase the validity 
of your assessment results by using information gathered from multiple 
assessment formats: short-answer items, objective items, and a 
variety of long-term and short-term performance tasks. (p. 244, 
emphasis in original) 

Before the implementation of performance assessment in our context, there is a 
need for teachers and students to understand performance assessment 
alternatives and their limitations. There is also a need to develop 
performance standards, adopt performance-based instruction, and 
supply schools with all types of resources (e.g., tape and video recorders, 
computers, references). 

We cannot assume that students at all levels are capable of assessing their own 
performance in English as a foreign language. Nor can we assume that 
teachers have the time to continuously assess all students’ performance in large 
classes. Therefore, I agree with assessment specialists who suggest that the teacher 
should share the responsibility for assessment with his/her students. 

Finally, to ensure the success of performance assessment, I strongly agree with 
the educators (e.g., Brualdi, 1998; Elliott, 1995; Pachler and Field, 1997) who 
suggest that performance assessment should be an integral part of teaching and 
learning because this will save time for both teachers and students. 